lder class
Benign overfitting and adaptive nonparametric regression
Chhor, Julien, Sigalla, Suzanne, Tsybakov, Alexandre B.
Benign overfitting has attracted a great deal of attention in the recent years. It was initially motivated by the fact that deep neural networks have good predictive properties even when perfectly interpolating the training data [Belkin et al., 2019a], [Belkin et al., 2018b], [Zhang et al., 2021], [Belkin, 2021]. Such a behavior stands in strong contrast with the classical point of view that perfectly fitting the data points is not compatible with predicting well. With the aim of understanding this new phenomenon, a series of recent papers studied benign overfitting in linear regression setting, see [Bartlett et al., 2020], [Tsigler and Bartlett, 2020], [Chinot and Lerasle, 2020], [Muthukumar et al., 2020], [Bartlett and Long, 2021], [Lecué and Shang, 2022] and the references therein. The main conclusion for the linear model is that an unbalanced spectrum of the design matrix and over-parametrization, which in a sense approaches the model to non-parametric setting, are essential for benign overfitting to occur in linear regression. Extensions to kernel ridgeless regression were considered in [Liang and Rakhlin, 2020] when the sample size n and the dimension d were assumed to satisfy n null d, and in [Liang et al., 2020] for a more general case d null n
Bounding the expectation of the supremum of empirical processes indexed by H\"older classes
We obtain upper bounds on the expectation of the supremum of empirical processes indexed by H\"older classes of any smoothness and for any distribution supported on a bounded set. Another way to see it is from the point of view of integral probability metrics (IPM), a class of metrics on the space of probability measures: our rates quantify how quickly the empirical measure obtained from $n$ independent samples from a probability measure $P$ approaches $P$ with respect to the IPM indexed by H\"older classes. As an extremal case we recover the known rates for the Wassertein-1 distance.
Robust Nonparametric Regression under Huber's $\epsilon$-contamination Model
Du, Simon S., Wang, Yining, Balakrishnan, Sivaraman, Ravikumar, Pradeep, Singh, Aarti
We consider the non-parametric regression problem under Huber's $\epsilon$-contamination model, in which an $\epsilon$ fraction of observations are subject to arbitrary adversarial noise. We first show that a simple local binning median step can effectively remove the adversary noise and this median estimator is minimax optimal up to absolute constants over the H\"{o}lder function class with smoothness parameters smaller than or equal to 1. Furthermore, when the underlying function has higher smoothness, we show that using local binning median as pre-preprocessing step to remove the adversarial noise, then we can apply any non-parametric estimator on top of the medians. In particular we show local median binning followed by kernel smoothing and local polynomial regression achieve minimaxity over H\"{o}lder and Sobolev classes with arbitrary smoothness parameters. Our main proof technique is a decoupled analysis of adversary noise and stochastic noise, which can be potentially applied to other robust estimation problems. We also provide numerical results to verify the effectiveness of our proposed methods.